-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refine Node controller to support Gateway HA #4069
Conversation
/test-multicluster-e2e |
Codecov Report
@@ Coverage Diff @@
## main #4069 +/- ##
==========================================
+ Coverage 56.09% 60.75% +4.66%
==========================================
Files 371 392 +21
Lines 53327 53927 +600
==========================================
+ Hits 29913 32766 +2853
+ Misses 21071 18716 -2355
- Partials 2343 2445 +102
|
@luolanzone : could you explain any real issue can happen without the changes? If there were indeed issues, please describe them in the commit description too. |
0a46d83
to
3b9b20d
Compare
/test-multicluster-e2e |
3b9b20d
to
2b4e9d4
Compare
df38938
to
3f94ecb
Compare
if err := r.Client.List(ctx, nodeList, &client.ListOptions{}); err != nil { | ||
return err | ||
} | ||
// Empty the cache in case of initializeActiveGateway failed and rerun. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failure -> failure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see you added code to do the cleanup? Maybe simpler to move the code to the end of func before returning without no error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, add the candidates initialization before return nil.
3f94ecb
to
204f359
Compare
} | ||
klog.ErrorS(err, "Failed to get Node", "node", req.Name) | ||
return ctrl.Result{}, err | ||
return ctrl.Result{}, client.IgnoreNotFound(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should skip the previous log if not found, so I would return in line 103.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, fixed
isActiveGateway := r.activeGateway == node.Name | ||
toRecreate := false | ||
if hasGWAnnotation { | ||
r.gatewayCandidates[req.Name] = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should delete the node from gatewayCandidates if no annotation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
204f359
to
24da6e0
Compare
8ee2fee
to
bca6819
Compare
bca6819
to
df49837
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Several more nits for you to consider.
} | ||
existingGW.GatewayIP = newGateway.GatewayIP | ||
existingGW.InternalIP = newGateway.InternalIP | ||
if err := r.Client.Update(ctx, existingGW, &client.UpdateOptions{}); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a comment about if the Gateway version in the client cache is stale, the update operation will fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
df49837
to
5ee993b
Compare
@luolanzone : please check if you can improve coverage to meet the target. |
5ee993b
to
73ad778
Compare
/test-multicluster-e2e |
In order to support Gateway active-standby mode HA, the Node controller has been refined to handle Node readiness change. There will be at most one Gateway in a given Namespace, and the Gateway controller is updated correspondingly. 1. Check Node readiness and create a Gateway CR if Node is ready and there is no existing Gateway. 2. Initilize active Gateway and Gateway candidate Nodes with annotation 'multicluster.antrea.io/gateway'. 3. Add Gateway webhook to allow at most one Gateway in a given Namespace. Signed-off-by: Lan Luo <[email protected]>
73ad778
to
49ceb32
Compare
/test-multicluster-e2e |
@jianjuns Could you help to check again? I added a few more tests, looks like the result of |
/skip-all |
…#4069) In order to support Gateway active-standby mode HA, the Node controller has been revised to handle Node readiness changes. There will be at most one Gateway in a single Namespace. 1. Check Node readiness and create a Gateway CR if Node is ready and there is no existing Gateway. 2. Initialize active Gateway and Gateway candidate Nodes with annotation 'multicluster.antrea.io/gateway'. 3. Add Gateway webhook to allow at most one Gateway in a single Namespace. Signed-off-by: Lan Luo <[email protected]>
In order to support Gateway active-standby mode HA, the Node
controller has been refined to handle Node readiness changes.
There will be at most one Gateway in a given Namespace, and the
Gateway controller is updated correspondingly.
and there is no existing Gateway.
'multicluster.antrea.io/gateway'.
Namespace.
For #3754
Signed-off-by: Lan Luo [email protected]